Using Random Matrix Theory one can derive exact relations between the
eigenvalue spectrum of the covariance matrix and the eigenvalue spectrum of its
estimator (the experimentally measured correlation matrix). These relations will
be used to analyze a particular case of the correlations in financial series and
to show that, contrary to earlier claims, correlations can also be measured in
the "random" part of the spectrum. Implications for portfolio optimization are
briefly discussed.
1 Introduction

Empirically determined correlation matrices appear in many research areas.
In financial analysis they became one of the cornerstones of financial risk
analysis through Markowitz's idea of the optimal portfolio [1]. The practical
way to obtain the correlation matrix is to perform a (large) number T of
measurements of a (large) number N of experimental quantities (price
fluctuations) $x_{it}$, $i = 1, \ldots, N$, $t = 1, \ldots, T$, and to estimate
the correlation matrix by

$$c_{ij} = \frac{1}{T} \sum_{t=1}^{T} x_{it} x_{jt}, \qquad (1)$$

where we assumed that $\langle x_{it} \rangle = 0$. The relation of the
estimated correlation matrix $c_{ij}$ to the true correlation matrix $C_{ij}$,
in particular the relation between the eigenvalue spectra of the two matrices,
was addressed in the literature (cf. e.g. [2,3,4,11]), assuming the
experimentally measured quantities come from some correlated random process.
We shall address this point in the next section, assuming these distributions
come from the correlated Gaussian ensemble.
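
As a minimal sketch of the estimator (1), assume the fluctuations are stored as
an N x T NumPy array with one row per asset. The function name
`estimate_correlation`, the centring and scaling step, and the synthetic data
are illustrative choices made here, not part of the original analysis.

```python
import numpy as np

def estimate_correlation(x):
    """Estimator (1): c_ij = (1/T) sum_t x_it x_jt for an N x T array x.

    Each row is first centred and scaled to unit standard deviation,
    so that <x_it> = 0 holds and c has a unit diagonal.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean(axis=1, keepdims=True)
    x = x / x.std(axis=1, keepdims=True)
    T = x.shape[1]
    return x @ x.T / T

rng = np.random.default_rng(0)        # synthetic data, for illustration only
x = rng.standard_normal((5, 1000))    # N = 5 series, T = 1000 observations
c = estimate_correlation(x)
```

The result is a symmetric N x N matrix; its eigenvalues can be obtained with
`np.linalg.eigvalsh(c)`.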

In the context of financial analysis, correlation matrices were discussed
in [5,6,7,8,9,10]. The general conclusion, based on the results of [6], was
rather pessimistic. Let us quote here from the paper [8]: "... covariance
matrices determined from empirical financial time series appear to contain such
a high amount of noise that their structure can essentially be regarded as
random. This seems, however, to be in contradiction with the fundamental role
played by covariance matrices in finance, which constitute the pillars of modern
investment theory and have also gained industry-wide applications in risk
management." Let us recall the arguments which led to this conclusion. In the
paper [6] the authors analyzed the S&P500 price fluctuations of N = 406 assets
over a period of T = 1309 days between 1991 and 1996. Price fluctuations were
normalized to the standard deviation $\sigma_i = 1$ and the correlation matrix
was constructed following (1). The spectrum of the 406 x 406 correlation matrix
contains a number of large eigenvalues, the largest of which was interpreted as
the "market". The spectrum with the nine largest eigenvalues omitted is
presented in Fig. 1. The lower part of the spectrum was interpreted as the
distribution of eigenvalues of a random uncorrelated Gaussian matrix, and
therefore as lacking any information about the correlations.

Fig. 1. Eigenvalue distribution of the normalized correlation matrix following [6].
The nine largest eigenvalues are missing. The line represents a RMM distribution
with a single eigenvalue.

In the following we will try to dispel this pessimism. In the next section we
will present a simple model of correlated Gaussian fluctuations and discuss the
properties of the resolvent of this model. We will use the Random Matrix Theory
solution [11] to construct explicitly the eigenvalue density function
$\rho(\lambda)$ in case the spectrum of the correlation matrix is known. We
shall also discuss the problem of obtaining this spectrum from the
experimentally measured eigenvalue distribution. We will apply this method to
reanalyze the financial time series discussed above. A summary and conclusions
are given in the last section.
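
The "random" reference spectrum can be reproduced numerically. The sketch below
assumes uncorrelated Gaussian data with unit variance (C equal to the identity),
for which the limiting eigenvalue density is the Marchenko-Pastur law with
support $[(1-\sqrt{r})^2, (1+\sqrt{r})^2]$, $r = N/T$. The function names are
illustrative, and the data are synthetic rather than the S&P500 series.

```python
import numpy as np

def marchenko_pastur(lam, r):
    """Marchenko-Pastur density for C = identity and r = N/T < 1."""
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    lo, hi = (1 - np.sqrt(r)) ** 2, (1 + np.sqrt(r)) ** 2
    rho = np.zeros_like(lam)
    inside = (lam > lo) & (lam < hi)
    rho[inside] = np.sqrt((hi - lam[inside]) * (lam[inside] - lo)) / (
        2 * np.pi * r * lam[inside])
    return rho

def noise_eigenvalues(N, T, seed=0):
    """Eigenvalues of c = X X^T / T for an N x T matrix of i.i.d. N(0, 1) entries."""
    x = np.random.default_rng(seed).standard_normal((N, T))
    return np.linalg.eigvalsh(x @ x.T / T)

N, T = 406, 1309                      # matrix sizes quoted for the S&P500 data
r = N / T
eigs = noise_eigenvalues(N, T)
edges = ((1 - np.sqrt(r)) ** 2, (1 + np.sqrt(r)) ** 2)
```

A histogram of `eigs` can then be compared with `marchenko_pastur` evaluated on
a grid between the two `edges`.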



2 The correlated Gaussian fluctuations



Let us consider a Gaussian model generating a matrix X of fluctuations
$x_{it}$, $i = 1, \ldots, N$, $t = 1, \ldots, T$, with the matrix measure

$$P(X)\, DX = \frac{1}{\mathcal{Z}} \exp\left(-\frac{1}{2} \mathrm{Tr}\, X^T C^{-1} X\right) DX, \qquad (2)$$

where $\mathcal{Z}$ is a normalization factor and C is a real symmetric positive
correlation matrix, representing the genuine two-point correlations in the
system. A single matrix X generated from this ensemble can be used to estimate
the spectrum of C by (1). The matrix C can always be diagonalized and we assume
the set of its eigenvalues $\{\mu_j\}$ to be known. We define two resolvents,
G(Z) and g(z), by



$$G(Z) = \frac{1}{N} \mathrm{Tr}\, (Z - C)^{-1}, \qquad g(z) = \frac{1}{N} \left\langle \mathrm{Tr}\, (z - c)^{-1} \right\rangle, \qquad (3)$$

where the averaging is made with the measure (2). In (3) Z and z are complex
variables. In the limit $N \to \infty$, $T \to \infty$ with $r = N/T < 1$ kept
fixed, we can use the diagrammatic technique [11] to find the exact relation
between the two resolvents. To take the limit for G(Z) we assume that the
eigenvalues appear in blocks with degeneracy $k_j = p_j N$, $j = 1, \ldots, M$,
and that the number of blocks M stays finite. The function G(Z) is analytic on
the complex Z plane except for poles at $Z = \mu_j$ with residues $p_j$.
Similarly, g(z) is an analytic function on the complex z plane with one or more
cuts along the real axis. The discontinuities across these cuts are related to
the eigenvalue density $\rho(\lambda)$ by

$$\rho(\lambda) = -\frac{1}{\pi}\, \mathrm{Im}\, g(\lambda + i0^{+}). \qquad (4)$$
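
Relation (4) can be illustrated on a finite sample. The sketch below replaces
the averaged resolvent g(z) by the resolvent of a single set of eigenvalues and
evaluates it slightly above the real axis. The small imaginary part `eps` is a
smoothing parameter introduced here for illustration, and the function names
are hypothetical.

```python
import numpy as np

def resolvent(z, eigenvalues):
    """Finite-N resolvent g(z) = (1/N) sum_k 1 / (z - lambda_k)."""
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    return np.mean(1.0 / (z - eigenvalues))

def density_from_resolvent(lam, eigenvalues, eps=0.05):
    """Approximate rho(lambda) by -(1/pi) Im g(lambda + i*eps), cf. (4)."""
    return np.array([-resolvent(x + 1j * eps, eigenvalues).imag / np.pi
                     for x in np.atleast_1d(lam)])
```

For finite `eps` each eigenvalue contributes a Lorentzian of width `eps`, so the
result is a smoothed histogram that sharpens as `eps` decreases.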

The exact relation between the resolvents has the form of a duality relation

$$z\, g(z) = Z\, G(Z), \qquad (5)$$

where the complex arguments z and Z are related by the conformal map

$$z = Z \left(1 - r + r Z G(Z)\right) = Z \left(1 + r \sum_{j=1}^{M} \frac{p_j \mu_j}{Z - \mu_j}\right). \qquad (6)$$

On the "physical" sheet of the z plane $z \to \infty$ for $Z \to \infty$ and, as
follows from (6), the poles at $Z = \mu_j$ are mapped to $z = \infty$ on other
Riemann sheets of g(z).

The relation between g(z) and G(Z) can be used to find the eigenvalue
distribution $\rho(\lambda)$ for any set $\{\mu_j, p_j\}$. Using (6) we find the
position of the cuts on the z plane by relating them to the map of the
$\mathrm{Im}\, z = 0^{\pm}$ lines on the Z plane. We then find the imaginary
part of the resolvent g(z) using (5).
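
This procedure can be sketched numerically. The sketch assumes the map (6) in
the form $z = Z(1 - r + r Z G(Z))$ with $G(Z) = \sum_j p_j/(Z - \mu_j)$. For
each real $\lambda$ it solves the resulting polynomial equation for Z at
$z = \lambda + i\epsilon$ and, as a selection rule assumed here, keeps the root
with the largest imaginary part. The function name and the regulator `eps` are
illustrative.

```python
import numpy as np

def density_from_spectrum(lam, mu, p, r, eps=1e-6):
    """Eigenvalue density rho(lambda) of the estimator for a block spectrum {mu_j, p_j}."""
    mu = np.asarray(mu, dtype=float)
    p = np.asarray(p, dtype=float)
    full = np.poly(mu)                       # prod_j (Z - mu_j), highest power first
    partial = np.zeros(len(mu))              # sum_j p_j mu_j prod_{k != j} (Z - mu_k)
    for j in range(len(mu)):
        partial = np.polyadd(partial, p[j] * mu[j] * np.poly(np.delete(mu, j)))
    # Z * full + r * Z * partial is the numerator of the assumed map z(Z).
    numerator = np.polyadd(np.polymul([1.0, 0.0], full),
                           r * np.polymul([1.0, 0.0], partial))
    rho = []
    for x in np.atleast_1d(lam):
        z = x + 1j * eps
        roots = np.roots(np.polysub(numerator, z * full))
        Z = roots[np.argmax(roots.imag)]     # root in the upper half-plane
        g = Z * np.sum(p / (Z - mu)) / z     # duality relation (5)
        rho.append(-g.imag / np.pi)          # relation (4)
    return np.array(rho)
```

For a single block with $\mu = 1$ the result reduces to the Marchenko-Pastur
density, which gives a simple consistency check of the root selection.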

It is more difficult to obtain $\{\mu_j, p_j\}$ from the measured distribution
$\rho(\lambda)$. The eigenvalue distribution is known only approximately, from
one realization of the estimator c. It is nevertheless possible to extract more
information about the spectrum of the correlation matrix than suggested by the
pessimistic approach quoted above. Both resolvents contain information about
the moments of the correlation matrices. Expanding around $Z = \infty$ (resp.
$z = \infty$) we have



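$$G(Z) = \sum_{i=0}^{\infty} \frac{1}{Z^{i+1}} \frac{1}{N} \mathrm{Tr}\, C^i, \qquad g(z) = \sum_{i=0}^{\infty} \frac{1}{z^{i+1}} \frac{1}{N} \left\langle \mathrm{Tr}\, c^i \right\rangle. \qquad (7)$$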

For r < 1 we have similar expansions around Z = 0 (resp. z = 0). The duality
relation can be viewed as a relation between the moments

$$M_i = \sum_{j=1}^{M} p_j \mu_j^i \quad \text{and} \quad m_i = \int d\lambda\, \rho(\lambda)\, \lambda^i. \qquad (8)$$
For the first few moments we have:

$$\begin{aligned}
M_1 &= m_1, \\
M_2 &= m_2 - r m_1^2, \\
M_3 &= m_3 - 3 r m_1 m_2 + 2 r^2 m_1^3,
\end{aligned} \qquad (9)$$

and

$$\begin{aligned}
M_{-1} &= (1 - r)\, m_{-1}, \\
M_{-2} &= (1 - r)^2 m_{-2} - r (1 - r)\, m_{-1}^2, \\
M_{-3} &= (1 - r)^3 m_{-3} - 3 r (1 - r)^2 m_{-1} m_{-2} + r^2 (1 - r)\, m_{-1}^3.
\end{aligned} \qquad (10)$$
These relations can be used to determine the moments of the correlation
matrix C from the moments of its estimator c. It is clear that high moments
will have larger errors.
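
The moment relations can be checked on synthetic data. The sketch below uses
only the first three positive moments and the first two negative moments. It
draws one Gaussian sample for a two-block diagonal C chosen here for
illustration (eigenvalues 0.5 and 1.5 with equal weights) and compares the
reconstructed moments with the exact ones. The function name and all numerical
settings are assumptions of this example.

```python
import numpy as np

def true_moments_from_sample(eigs, r):
    """Estimate moments M_k of C from the eigenvalues of its estimator c."""
    eigs = np.asarray(eigs, dtype=float)
    m = {k: np.mean(eigs ** k) for k in (1, 2, 3, -1, -2)}
    return {
        1: m[1],
        2: m[2] - r * m[1] ** 2,
        3: m[3] - 3 * r * m[1] * m[2] + 2 * r ** 2 * m[1] ** 3,
        -1: (1 - r) * m[-1],
        -2: (1 - r) ** 2 * m[-2] - r * (1 - r) * m[-1] ** 2,
    }

rng = np.random.default_rng(1)
N, T = 200, 800                                  # r = N/T = 0.25
mu = np.repeat([0.5, 1.5], N // 2)               # two blocks, p = (0.5, 0.5)
x = np.sqrt(mu)[:, None] * rng.standard_normal((N, T))   # covariance diag(mu)
eigs = np.linalg.eigvalsh(x @ x.T / T)
estimated = true_moments_from_sample(eigs, N / T)
exact = {k: np.mean(mu ** k) for k in estimated}
```

Comparing `estimated` with `exact` for a few random seeds gives a feeling for
how quickly the errors grow with the order of the moment.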

As an example let us reconsider the analysis of the S&P500 data presented
in the Introduction. In the analysis we assumed six eigenvalue blocks in the
correlation matrix, which gave the optimal fit to the experimental distribution.






Fig. 2. Analysis of the eigenvalue distribution presented above. We assumed six
eigenvalues, which gave the optimal fit of the distribution. The determined
eigenvalues are represented by blue lines and their relative height corresponds
to the probability $p_j$. The red line is the fit of the distribution with six
eigenvalues, to be compared with the green line, corresponding to the fit
presented before.



3 Discussion and summary 



The experimental estimator of the correlation matrix, although containing
statistical noise, can nevertheless be used to extract much more information
about the spectrum of the true correlation matrix than is usually believed.
This information can be very important when the optimized portfolio is
constructed. In case the matrix C is diagonal, one can derive exact relations
between the volatility $\sigma_{\rm true}$ of the true optimal portfolio,
obtained from the correlation matrix C, the volatility $\sigma_{\rm est}$ of
the portfolio based on the estimate c, and the volatility $\sigma_{\rm pred}$
predicted from this estimate. This relation depends on r,

$$\sigma_{\rm est} = \frac{\sigma_{\rm true}}{\sqrt{1 - r}} = \frac{\sigma_{\rm pred}}{1 - r},$$

and shows that, particularly for r close to 1, the error can be quite substantial.
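
A small Monte Carlo check of this kind of relation can be sketched as follows,
under assumptions made here for illustration: C is the identity, the portfolio
is the fully invested minimum-variance portfolio, and the expected large-N
behaviour is that the realised volatility of the estimated portfolio exceeds
the true optimum by a factor of about $1/\sqrt{1-r}$, while the volatility
predicted from the estimate is lower by a factor of about $\sqrt{1-r}$. The
names `sigma_est` and `sigma_pred` are labels introduced for this sketch.

```python
import numpy as np

def min_variance_weights(cov):
    """Fully invested minimum-variance weights w = cov^-1 1 / (1^T cov^-1 1)."""
    w = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return w / w.sum()

def volatility_ratios(N=400, T=1600, seed=0):
    """Return (sigma_est / sigma_true, sigma_pred / sigma_true) for C = identity."""
    rng = np.random.default_rng(seed)
    C = np.eye(N)
    x = rng.standard_normal((N, T))
    c = x @ x.T / T                               # estimator of C
    w_true, w_est = min_variance_weights(C), min_variance_weights(c)
    sigma_true = np.sqrt(w_true @ C @ w_true)
    sigma_est = np.sqrt(w_est @ C @ w_est)        # realised risk of estimated weights
    sigma_pred = np.sqrt(w_est @ c @ w_est)       # risk predicted from the estimate
    return sigma_est / sigma_true, sigma_pred / sigma_true

r = 400 / 1600
est_ratio, pred_ratio = volatility_ratios()
```

Repeating the experiment with T closer to N shows how quickly the gap between
the predicted and the realised volatility widens as r approaches 1.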
If the spectrum of the correlation matrix C is known, we can always predict
the shape of the measured eigenvalue distribution $\rho(\lambda)$ of its
estimator c. Although the formulas presented above are valid in principle only
for infinitely large matrices, in practice the predicted spectra agree very
well even for matrix sizes of order 100 and below.

It is also possible to invert the problem and to determine the spectrum of
C from that of the estimator c, although this step can be done only
approximately.